Why This Subject?





“If you look at the number of Americans killed since 9/11 by terrorism, it’s less than 100. If you look at the number that have been killed by gun violence, it’s in the tens of thousands.”



Source: NBCNews

DATA

Words on my data set

DATA (cont.)

my_data <- read.csv('Mother Jones - Mass Shootings Database, 1982 - 2023 - Sheet1.csv', na.strings = "-")
my_data2 <- read.csv('Violence Project Mass Shooter Database - Version 6.1 - Full Database.csv', na.strings = "-")

A Glimpse over the Data

glimpse(my_data)
## Rows: 164
## Columns: 26
## $ case                             <chr> "Louisville bank shooting", "Nashvill…
## $ city                             <chr> "Louisville", "Nashville", "East Lans…
## $ state                            <chr> "KY", "TN", "MI", "CA", "CA", "VA", "…
## $ date                             <chr> "4/10/2023", "3/27/2023", "2/13/2023"…
## $ summary                          <chr> "Connor Sturgeon, 25, opened fire ins…
## $ fatalities                       <int> 5, 6, 3, 7, 11, 6, 5, 3, 5, 3, 7, 3, …
## $ injured                          <int> 8, 6, 5, 1, 10, 6, 25, 2, 2, 2, 46, 0…
## $ total_victims                    <int> 13, 12, 8, 8, 21, 12, 30, 5, 7, 5, 53…
## $ location                         <chr> "workplace", "School", "School", "wor…
## $ age_of_shooter                   <int> 25, 28, 43, 67, 72, 31, 22, 22, 15, 2…
## $ prior_signs_mental_health_issues <chr> "Yes", NA, NA, NA, "Yes", NA, "Yes", …
## $ mental_health_details            <chr> NA, NA, NA, NA, "According to the LA …
## $ weapons_obtained_legally         <chr> "Yes", "Yes", "Yes", NA, NA, NA, NA, …
## $ where_obtained                   <chr> "gun dealership in Louisville", NA, N…
## $ weapon_type                      <chr> "Semiautomatic Rifle", "One Semiautom…
## $ weapon_details                   <chr> "AR-15 rifle", NA, NA, NA, NA, NA, NA…
## $ race                             <chr> "White", "White", "Black", "Asian", "…
## $ gender                           <chr> "M", "F (\"identifies as transgender\…
## $ sources                          <chr> "https://apnews.com/article/downtown-…
## $ mental_health_sources            <chr> NA, NA, NA, NA, "https://www.latimes.…
## $ sources_additional_age           <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ latitude                         <dbl> NA, NA, NA, NA, NA, 36.77262, 38.8809…
## $ longitude                        <dbl> NA, NA, NA, NA, NA, -76.25128, -104.7…
## $ type                             <chr> "Mass", "Mass", "Mass", "Spree", "Mas…
## $ year                             <int> 2023, 2023, 2023, 2023, 2023, 2022, 2…
## $ day_of_week                      <chr> "Monday", "Monday", "Monday", "Monday…

Data Key Terms

Case: The name by which the case is known in the media

City/State: Where the incident happened

Date/Year/Day of Week: The specific date, year, and weekday on which the incident occurred

Summary: A short summary of the case

Fatalities/Injured/Total Victims: Casualty counts for the incident

Location: Type of location where the incident occurred

Age of Shooter/Race/Gender/Prior Signs of Mental Health Issues: The shooter’s profile

Weapons Obtained Legally/Where Obtained/Weapon Type/Weapon Details: The weapons profile

Sources/Mental Health Sources/Additional Age Sources: All sources used to compile this data set

Longitude/Latitude: GPS coordinates of the incident’s location

Type: Whether the incident is designated a Mass Shooting or a Shooting Spree
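The slides do not say how the year and day_of_week columns were produced. As a minimal sketch, assuming they can be derived from the date column, base R can parse the month/day/year strings shown in the glimpse output and extract both values. The variable names below are only for illustration.

```r
# Toy dates copied from the first rows of the glimpse output
dates <- c("4/10/2023", "3/27/2023", "2/13/2023")

# Parse month/day/year strings into Date objects
parsed <- as.Date(dates, format = "%m/%d/%Y")

# Derive the year and the weekday name (weekday names depend on the locale)
year <- as.integer(format(parsed, "%Y"))
day_of_week <- weekdays(parsed)

print(data.frame(date = dates, year = year, day_of_week = day_of_week))
```

In an English locale all three toy dates come out as Monday, which matches the day_of_week values shown in the glimpse.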

Packages

# install.packages("tidyverse")  # run once if tidyverse is not installed
library(tidyverse)  # attaches dplyr, stringr, ggplot2, and related packages

Data Wrangling

# Coerce each column to the type used in the analysis.
my_data$age_of_shooter <- as.integer(my_data$age_of_shooter)
my_data$fatalities <- as.integer(my_data$fatalities)
my_data$injured <- as.integer(my_data$injured)
my_data$total_victims <- as.integer(my_data$total_victims)
my_data$latitude <- as.numeric(my_data$latitude)
my_data$longitude <- as.numeric(my_data$longitude)

# Remove stray carriage returns and newlines from the labels.
my_data$location <- str_replace_all(my_data$location, '[\r\n]', '')
my_data$race <- str_replace_all(my_data$race, '[\r\n]', '')

# Standardize inconsistent category labels (%in% is safe when values are NA).
my_data$location[my_data$location %in% c('religious', 'Religious')] <- 'Religious Place'
my_data$location[my_data$location %in% 'workplace'] <- 'Workplace'

my_data$race[my_data$race %in% 'unclear'] <- 'Unclear'
my_data$race[my_data$race %in% 'black'] <- 'Black'
my_data$race[my_data$race %in% 'white'] <- 'White'

# Row 2 (the Nashville case) holds a free-text gender value, so recode it by position.
my_data$gender[2] <- 'Trans'
my_data$gender[my_data$gender %in% 'M'] <- 'Male'
my_data$gender[my_data$gender %in% 'F'] <- 'Female'
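The same relabeling can also be written as a single dplyr pipeline. This is only a sketch on a small toy data frame (the rows are made up for illustration), not the code used for the slides.

```r
library(dplyr)

# Toy rows that mimic the inconsistent labels in the raw data
toy <- data.frame(
  location = c("workplace", "Religious", "religious", "School"),
  gender = c("M", "F", "M", "M"),
  stringsAsFactors = FALSE
)

cleaned <- toy %>%
  mutate(
    # Compare in lower case so both spellings of a label are caught
    location = case_when(
      tolower(location) == "religious" ~ "Religious Place",
      tolower(location) == "workplace" ~ "Workplace",
      TRUE ~ location
    ),
    # Map the one-letter codes to full words
    gender = recode(gender, M = "Male", F = "Female")
  )

print(cleaned)
```

Keeping all label fixes in one mutate() call makes it easier to see every rule at once.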

What Is a Mass Shooting?


The FBI defines a mass shooting as any incident in which at least four people are murdered with a gun.


Source: DOJ

What questions should be raised?

  1. Does the time frame tell us anything about the incidents in general?

  2. Do age, race, and gender give any insight into the shooter’s profile?

  3. What stands out if we take the shooters with prior mental health issues out of the equation?

  4. Where are the incidents most likely to take place?

  5. What types of weapons are the assailants most likely to use?

  6. What conclusions can we draw about the shooter’s age, race, and prior mental health issues?

  7. What is interesting about how the age of the shooter changes over the years?

  8. Does gender play any role in relation to the age of the shooter?

  9. How are the incidents distributed across America?

A First Glance at the Incidents over the Years

Data Summary

##  age_of_shooter   fatalities      injured       total_victims   
##  Min.   :11.0   Min.   : 3.0   Min.   :  0.00   Min.   :  3.00  
##  1st Qu.:23.0   1st Qu.: 4.0   1st Qu.:  1.00   1st Qu.:  6.00  
##  Median :32.0   Median : 6.0   Median :  3.00   Median : 10.00  
##  Mean   :33.9   Mean   : 7.5   Mean   : 10.56   Mean   : 18.09  
##  3rd Qu.:43.0   3rd Qu.: 8.0   3rd Qu.:  9.50   3rd Qu.: 16.00  
##  Max.   :72.0   Max.   :58.0   Max.   :546.00   Max.   :604.00  
##                                NA's   :1        NA's   :1
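The code behind this summary is not shown on the slide. A minimal sketch, assuming it came from selecting the four numeric columns and calling summary(), looks like this. The toy values are the first five rows from the glimpse output, and the case labels are placeholders.

```r
library(dplyr)

# First five rows of the numeric columns, copied from the glimpse output
toy <- data.frame(
  case = c("case_a", "case_b", "case_c", "case_d", "case_e"),
  age_of_shooter = c(25L, 28L, 43L, 67L, 72L),
  fatalities = c(5L, 6L, 3L, 7L, 11L),
  injured = c(8L, 6L, 5L, 1L, 10L),
  total_victims = c(13L, 12L, 8L, 8L, 21L)
)

# Keep only the numeric columns and print the six-number summary for each
toy_summary <- toy %>%
  select(age_of_shooter, fatalities, injured, total_victims) %>%
  summary()

print(toy_summary)
```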

1. Las Vegas Strip Massacre: Most victims (604)

2. LA Dance Studio Mass Shooting: Oldest shooter (72 years old)

3. West Middle School Killings: Youngest shooter (11 years old)
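To find which cases sit at these extremes, one option is dplyr’s slice_max() and slice_min(). This is a sketch on toy rows with placeholder case names, not the lookup used for the slides.

```r
library(dplyr)

# Toy rows; the case names are placeholders, the numbers come from the glimpse output
toy <- data.frame(
  case = c("case_a", "case_b", "case_c", "case_d", "case_e"),
  age_of_shooter = c(25L, 28L, 43L, 67L, 72L),
  total_victims = c(13L, 12L, 8L, 8L, 21L)
)

# Row with the largest victim count
most_victims <- toy %>% slice_max(total_victims, n = 1)

# Rows with the oldest and the youngest shooter
oldest <- toy %>% slice_max(age_of_shooter, n = 1)
youngest <- toy %>% slice_min(age_of_shooter, n = 1)

print(most_victims)
print(oldest)
print(youngest)
```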

Who are they?

The Picture in General

Before 2002

After 2002

  • Because 2002 had no major mass shooting incidents, I chose it as the reference point for splitting the statistics.

  • Before 2002, the story seems to be about a few particular races, but after 2002 it becomes a story about all races.

  • The data set runs from 1982 to 2023, so each side of 2002 covers roughly two decades, yet the number of cases after 2002 is nearly double the number before it: 109 versus 55.
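The split can be sketched as follows. The toy years are made up for illustration; only the totals 55 and 109 come from the slide.

```r
library(dplyr)

# Made-up incident years, none of them in 2002
toy <- data.frame(year = c(1984, 1991, 1999, 2007, 2012, 2017, 2019, 2023))

# Label each incident by its side of the 2002 reference point, then count
counts <- toy %>%
  mutate(period = if_else(year < 2002, "Before 2002", "After 2002")) %>%
  count(period)

print(counts)

# Ratio of the case counts reported on the slide (after versus before)
ratio <- 109 / 55
print(round(ratio, 2))
```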

The Average Age of the Shooters among Races

Race          Average Age of the Shooters
White         ~ 28-29 years old
Latino        ~ 32-33 years old
Black         ~ 38-39 years old
Asian         ~ 41 years old
Native Am.    ~ 18 years old
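A table like this one can be computed with group_by() and summarise(). The sketch below uses the first four rows from the glimpse output as toy data, so the averages it prints are not the ones in the table.

```r
library(dplyr)

# Race and age for the first four rows of the glimpse output
toy <- data.frame(
  race = c("White", "White", "Black", "Asian"),
  age_of_shooter = c(25L, 28L, 43L, 67L)
)

# Mean shooter age per race, ignoring missing ages, sorted from youngest
avg_age <- toy %>%
  group_by(race) %>%
  summarise(mean_age = mean(age_of_shooter, na.rm = TRUE)) %>%
  arrange(mean_age)

print(avg_age)
```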

Gender of the Shooters

Gender         Percentage
Male           97%
Female         2.5%
Transgender    0.5%
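Percentages like these come from counting each category and dividing by the total. A minimal sketch with made-up toy values (not the real counts):

```r
library(dplyr)

# Made-up gender labels: 17 Male, 2 Female, 1 Trans
toy <- data.frame(gender = c(rep("Male", 17), rep("Female", 2), "Trans"))

# Count each label and convert the counts to percentages of all rows
gender_share <- toy %>%
  count(gender) %>%
  mutate(percent = round(100 * n / sum(n), 1)) %>%
  arrange(desc(n))

print(gender_share)
```

The same count() and mutate() pattern works for any categorical column, such as location or weapon_type.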

Gender and Age of the Shooters

Gender    Most Likely Age
Male      Early 20’s to Mid 40’s
Female    Mid 20’s or Mid 40’s

Where Do The Mass Shootings Likely Occur?

Location           Frequency
Workplace          ~ 34%
School             ~ 16%
Bar/Club/Rest.     ~ 11%
Retail             ~ 10%
Other              ~ 9%
Religious Place    ~ 6%

It is heartbreaking to see School ranked second on the list, which means many innocent children had their futures taken from them.

What Weapons Were Likely Used by The Assailants?

Firearm                  Percent of Shooters Carrying
Semi-Auto Handgun        ~ 41%
Semi-Auto Rifle          ~ 20%
Handgun (Old Version)    ~ 6.7%
Rifle (Old Version)      ~ 5%
Assault Rifle            ~ 5%
Shotgun                  ~ 4%

Race and Mental Health Issues

Race      Prior Mental Health Issues
Asian     90%
White     69%
Latino    67%
Black     43%
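The slide does not say which denominator these percentages use, and the column has many missing values, so the choice matters. The sketch below, on made-up toy rows, computes the share of "Yes" both among cases with a known answer and among all cases.

```r
library(dplyr)

# Made-up rows; NA means the data set records no answer for that case
toy <- data.frame(
  race = c("White", "White", "White", "Black", "Black", "Asian"),
  prior = c("Yes", "No", NA, "Yes", "No", "Yes")
)

shares <- toy %>%
  group_by(race) %>%
  summarise(
    # Share of "Yes" among cases with a recorded answer
    share_known = mean(prior == "Yes", na.rm = TRUE),
    # Share of "Yes" among all cases, counting NA as not "Yes"
    share_all = sum(prior == "Yes", na.rm = TRUE) / n()
  )

print(shares)
```

For the toy White group the two shares differ (one half versus one third), which shows how much the treatment of missing values can move the result.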

With Prior Mental Health Issues

Now the shooters with prior signs of mental health issues are added to the graph.

The incidents in which the shooter had prior mental health issues are plotted with a plus sign (+) on the plot above.

Without Prior Mental Health Issues Plot

Now we take those cases out of the plot to see what it looks like without them.

Compared to the original plot, the density of the dots is visibly much lower. Next, we quantify the difference between the plots with and without prior mental health issues.

Checking the Intuition with Numbers

The plots suggest some conclusions by eye. Now we look at numbers from the data to see how much mental health issues contribute to the problem.

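# Count the cases with prior signs of mental health issues.
# filter() drops rows where the column is NA, so only explicit "Yes" rows remain.
my_data %>%
  filter(prior_signs_mental_health_issues == "Yes") %>%
  nrow()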
## [1] 80

If we take the cases with prior mental health issues off the chart, eighty cases disappear, which is almost half of the 164 mass shootings in the data set.

# Count the cases in which the weapons were not obtained legally.
my_data %>%
  filter(weapons_obtained_legally == "No") %>%
  nrow()
## [1] 16

If instead I take the cases in which the weapons were not obtained legally off the chart, only 16 cases disappear, which is roughly 10% of all cases.
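The two shares can be checked with simple arithmetic on the counts above and the 164 rows in the data set. The sketch also shows, on toy values taken from the glimpse output, that filter() silently drops the rows where the answer is missing.

```r
library(dplyr)

# First seven values of the column, copied from the glimpse output
toy <- data.frame(
  prior_signs_mental_health_issues = c("Yes", NA, NA, NA, "Yes", NA, "Yes")
)

# filter() keeps only rows where the condition is TRUE, so NA rows are dropped
n_yes <- toy %>%
  filter(prior_signs_mental_health_issues == "Yes") %>%
  nrow()
n_missing <- sum(is.na(toy$prior_signs_mental_health_issues))

# Shares of all 164 cases, using the counts reported on the slide
share_mental_health <- round(100 * 80 / 164, 1)
share_illegal <- round(100 * 16 / 164, 1)

print(c(n_yes = n_yes, n_missing = n_missing))
print(c(mental_health = share_mental_health, illegal = share_illegal))
```

Because cases with a missing answer are counted as neither "Yes" nor "No", both shares are lower bounds on the true proportions.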

Conclusion:

People have long argued over whether we should strengthen background checks or tighten gun control laws. These numbers suggest that background checks matter more than gun control, and that background checks on mental health issues are especially crucial. Cutting the number of cases by fifty percent would be ideal, but even a twenty or thirty percent reduction would save many lives.

Prior Signs of Mental Health Issues    Age of the Shooter
Yes                                    around 23 and 40
No                                     around 30


Geography Graph of The Events

The graph shows that the incidents most often occur on the East and West sides of the country, while the Midwest is the least likely region for mass shootings.

Colorado ranks in the top 5 states for mass shootings even though its population is not in the top 20 nationwide.

Massachusetts, surprisingly, has no recorded mass shootings even though its population is in the top 16 nationwide.

State Population Source: https://www.statsamerica.org/sip/rank_list.aspx?rank_label=pop1

Pick Your Day to Go Out.

Sunday is the deadliest day of the week in terms of mass shootings, but Monday is the day on which a mass shooter is most likely to act.
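The two claims rest on different measures: the deadliest day has the most fatalities, while the most likely day has the most incidents. A minimal sketch on made-up toy rows shows how both can be computed per weekday.

```r
library(dplyr)

# Made-up incidents; these numbers are for illustration only
toy <- data.frame(
  day_of_week = c("Monday", "Monday", "Monday", "Sunday", "Sunday", "Friday"),
  fatalities = c(5L, 6L, 3L, 20L, 11L, 4L)
)

# Number of incidents and total fatalities for each weekday
by_day <- toy %>%
  group_by(day_of_week) %>%
  summarise(incidents = n(), fatalities = sum(fatalities))

# Day with the most incidents versus day with the most fatalities
most_frequent <- by_day %>% slice_max(incidents, n = 1)
deadliest <- by_day %>% slice_max(fatalities, n = 1)

print(by_day)
print(most_frequent$day_of_week)
print(deadliest$day_of_week)
```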